feat: support UUIDs to pyarrow on more backends #8901
We need to avoid mixing pyarrow and pandas conversion paths.
OK, I think this brings up a larger philosophical question: Do we want to totally separate the pandas and pyarrow codepaths, or can they rely on each other? Currently, to get pyarrow results from a backend:
I think the coupling between pandas and pyarrow for this conversion isn't inherently bad (we don't need to implement the db -> pyarrow path!), but I agree that it should be isolated, so that it is very clear where we are mixing these two ecosystems. That way, backends that don't need the pandas path only require pyarrow to be installed, not pandas. So I see two options:
I think I would lean towards 2. I want to remove reliance on pandas as much as possible. Possibly this implementation won't be that hard for these other backends.
I think we'd like to eventually be able to offer Ibis without requiring [...]. There's also the potential of using something that doesn't depend on either of those for the core (like printing tables), so I think we'd like to keep things as isolated from one another as possible. More importantly, sending anything through pandas is likely to result in some kind of type or value alteration that doesn't happen with pyarrow. Especially with NULLs, pandas is likely to do something completely different and incompatible with what pyarrow would do.
Ok, when I get back to this I'll try the db -> arrow method!
Is this PR still viable?
Viable, I just stopped needing it personally, so the urgency of it dropped a lot compared to the 5 million other PRs I have open haha. Feel free to close if you want, and re-open once someone actually finds time to work on it.
partially fixes #8902.
Implements UUID execution to pyarrow on some backends, and adds notimpl tests for the rest.